Probabilistic context-free grammars have a long track record as generative models in machine learning and symbolic regression. When used for symbolic regression, they generate algebraic expressions. We define the latter as equivalence classes of strings derived by a grammar and address the problem of calculating the probability of deriving a given expression with a given grammar. We show that the problem is undecidable in general. We then present specific grammars for generating linear, polynomial, and rational expressions, for which algorithms for calculating the probability of a given expression exist. For those grammars, we design algorithms for calculating the exact probability and for efficiently approximating it with arbitrary precision.
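The need to learn from unlabeled data is growing in contemporary machine learning. Methods for unsupervised feature ranking, which identify the most important features in such data, are therefore attracting increasing attention, as are their applications in studying high-throughput biological experiments or user bases. We propose FRANe (Feature Ranking via Attribute Networks), an unsupervised algorithm capable of finding key features in a given unlabeled data set. FRANe is based on ideas from network reconstruction and network analysis. As we demonstrate empirically on a large collection of benchmarks, FRANe performs better than state-of-the-art competitors. In addition, we provide a time complexity analysis of FRANe that further demonstrates its scalability. Finally, FRANe returns, as part of its result, the interpretable relational structures used to derive the feature importances.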